Facial Keypoint Detection


In this project, I have built an end-to-end facial keypoint detection system. Facial keypoints are points around the eyes, nose, and mouth on a face, and are used in many applications such as emotion recognition and Snapchat filters.

Part 1 : Investigating OpenCV, pre-processing, and face detection

  • Step 0: Detect Faces Using a Haar Cascade Classifier
  • Step 1: Add Eye Detection
  • Step 2: De-noise an Image for Better Face Detection
  • Step 3: Blur an Image and Perform Edge Detection
  • Step 4: Automatically Hide the Identity of an Individual

Part 2 : Training a Convolutional Neural Network (CNN) to detect facial keypoints

  • Step 5: Create a CNN to Recognize Facial Keypoints
  • Step 6: Compile and Train the Model
  • Step 7: Visualize the Loss

Part 3 : Putting parts 1 and 2 together to identify facial keypoints on any image

  • Step 8: Build a Robust Facial Keypoints Detector

Step 0: Detect Faces Using a Haar Cascade Classifier

A classification problem is the problem of distinguishing between distinct classes of things. With face detection, these distinct classes are 1) images of human faces and 2) everything else.

I have used OpenCV's implementation of Haar feature-based cascade classifiers to detect human faces in images. OpenCV provides many pre-trained face detectors, stored as XML files on GitHub.

Import Resources

In [1]:
# Import required libraries 

%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt
import math
import cv2
from PIL import Image
import time
import pandas as pd

By default, OpenCV assumes the ordering of an image's color channels is Blue, then Green, then Red (BGR). This differs from most image formats used in these experiments, whose color channels are ordered Red, then Green, then Blue (RGB). To swap the Blue and Red channels of the test image, I have used OpenCV's cvtColor function.

In [2]:
# Load in color image for face detection
image = cv2.imread('images/test_image_1.jpg')

# Convert the image to RGB colorspace
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Plot our image using subplots to specify a size and title
fig = plt.figure(figsize = (8,8))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])

ax1.set_title('Original Image')
ax1.imshow(image)
Out[2]:
<matplotlib.image.AxesImage at 0x2a281a69348>

There are 13 faces in this picture. I have used a Haar Cascade classifier to detect all the faces in this test image.

This face detector uses information about patterns of intensity in an image to reliably detect faces under varying light conditions. So, to use this face detector, I have first converted the image from color to grayscale.

Then, the trained architecture of the face detector is loaded and used to find faces.

In [3]:
# Convert the RGB  image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

# Extract the pre-trained face detector from an xml file
face_cascade = cv2.CascadeClassifier('detector_architectures/haarcascade_frontalface_default.xml')

# Detect the faces in image
faces = face_cascade.detectMultiScale(gray, 4, 6)

# Print the number of faces detected in the image
print('Number of faces detected:', len(faces))

# Make a copy of the original image to draw face detections on
image_with_detections = np.copy(image)

# Get the bounding box for each detected face
for (x,y,w,h) in faces:
    # Add a red bounding box to the detections image
    cv2.rectangle(image_with_detections, (x,y), (x+w,y+h), (255,0,0), 3)
    

# Display the image with the detections
fig = plt.figure(figsize = (8,8))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])

ax1.set_title('Image with Face Detections')
ax1.imshow(image_with_detections)
Number of faces detected: 13
Out[3]:
<matplotlib.image.AxesImage at 0x2a282731ac8>

Step 1: Add Eye Detection

To test the eye detector, I have first read in a new test image containing just a single face.

In [4]:
# Load in color image for face detection
image = cv2.imread('images/james.jpg')

# Convert the image to RGB colorspace
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Plot the RGB image
fig = plt.figure(figsize = (6,6))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])

ax1.set_title('Original Image')
ax1.imshow(image)
Out[4]:
<matplotlib.image.AxesImage at 0x2a281a866c8>

Though the image is black and white, it has been read in as a color image, so it needs to be converted to grayscale in order to perform the most accurate face detection.

So, the next step is to convert this image to grayscale, then load OpenCV's face detector and run it with parameters that detect this face accurately.

In [5]:
# Convert the RGB  image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

# Extract the pre-trained face detector from an xml file
face_cascade = cv2.CascadeClassifier('detector_architectures/haarcascade_frontalface_default.xml')

# Detect the faces in image
faces = face_cascade.detectMultiScale(gray, 1.25, 6)

# Print the number of faces detected in the image
print('Number of faces detected:', len(faces))

# Make a copy of the original image to draw face detections on
image_with_detections = np.copy(image)

# Get the bounding box for each detected face
for (x,y,w,h) in faces:
    # Add a red bounding box to the detections image
    cv2.rectangle(image_with_detections, (x,y), (x+w,y+h), (255,0,0), 3)
    

# Display the image with the detections
fig = plt.figure(figsize = (6,6))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])

ax1.set_title('Image with Face Detection')
ax1.imshow(image_with_detections)
Number of faces detected: 1
Out[5]:
<matplotlib.image.AxesImage at 0x2a28274f3c8>

Add an eye detector to the current face detection setup.

To set up an eye detector, I have used OpenCV's pre-trained eye cascade (haarcascade_eye.xml).

In [6]:
image_with_detections = np.copy(image)   

# Loop over the detections and draw their corresponding face detection boxes
for (x,y,w,h) in faces:
    cv2.rectangle(image_with_detections, (x,y), (x+w,y+h),(255,0,0), 3)  

    
# Print the number of faces detected in the image
print('Number of faces detected:', len(faces))

eye_cascade = cv2.CascadeClassifier('detector_architectures/haarcascade_eye.xml')

# Detect the faces in image
eyes = eye_cascade.detectMultiScale(gray, 1.02, 3)

print('Number of eyes detected:', len(eyes))

for (x,y,w,h) in eyes:
    cv2.rectangle(image_with_detections, (x,y), (x+w,y+h),(0,255,0), 3)  


fig = plt.figure(figsize = (6,6))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])

ax1.set_title('Image with Face and Eye Detection')
ax1.imshow(image_with_detections)
Number of faces detected: 1
Number of eyes detected: 2
Out[6]:
<matplotlib.image.AxesImage at 0x2a282789e88>

Step 2: De-noise an Image for Better Face Detection

Image quality is an important aspect of any computer vision task. Typically, when creating a set of images to train a deep learning network, significant care is taken to ensure that training images are free of visual noise or artifacts that hinder object detection. While computer vision algorithms - like a face detector - are typically trained on 'nice' data such as this, new test data doesn't always look so nice!

When applying a trained computer vision algorithm to a new piece of test data one often cleans it up first before feeding it in. This sort of cleaning - referred to as pre-processing - can include a number of cleaning phases like blurring, de-noising, color transformations, etc., and many of these tasks can be accomplished using OpenCV.

I have explored OpenCV's noise-removal functionality to clean up a noisy image, which can then be fed into the trained face detector.

Create a noisy image to work with

I have created an artificial noisy version of the previous multi-face image. This is a little exaggerated - we don't typically get images that are this noisy - but image noise, or 'graininess', in a digital image is a fairly common phenomenon.

In [7]:
# Load in the multi-face test image again
image = cv2.imread('images/test_image_1.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

image_with_noise = np.asarray(image)
noise_level = 40
noise = np.random.randn(*image.shape) * noise_level

# Add the noise and clip back to the valid 8-bit pixel range
image_with_noise = np.uint8(np.clip(image_with_noise + noise, 0, 255))

fig = plt.figure(figsize = (8,8))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])

ax1.set_title('Noisy Image')
ax1.imshow(image_with_noise)
Out[7]:
<matplotlib.image.AxesImage at 0x2a2827cd708>

In the context of face detection, the problem with an image like this is that - due to noise - we may miss some faces or get false detections.

I have applied the same trained OpenCV detector with the same settings as before, to see what sort of detections are obtained.

In [8]:
# Convert the RGB  image to grayscale
gray_noise = cv2.cvtColor(image_with_noise, cv2.COLOR_RGB2GRAY)

# Extract the pre-trained face detector from an xml file
face_cascade = cv2.CascadeClassifier('detector_architectures/haarcascade_frontalface_default.xml')

faces = face_cascade.detectMultiScale(gray_noise, 4, 6)

print('Number of faces detected:', len(faces))

image_with_detections = np.copy(image_with_noise)

for (x,y,w,h) in faces:
    # Add a red bounding box to the detections image
    cv2.rectangle(image_with_detections, (x,y), (x+w,y+h), (255,0,0), 3)

fig = plt.figure(figsize = (8,8))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])

ax1.set_title('Noisy Image with Face Detections')
ax1.imshow(image_with_detections)
Number of faces detected: 11
Out[8]:
<matplotlib.image.AxesImage at 0x2a2823223c8>

De-noise this image for better face detection

I have now de-noised this image enough so that all the faces in the image are properly detected.

In [9]:
# De-noise the image: h=10, hColor=10, templateWindowSize=7, searchWindowSize=21
denoised_image = cv2.fastNlMeansDenoisingColored(image_with_noise, None, 10, 10, 7, 21)

fig = plt.figure(figsize = (8,8))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])

ax1.set_title('DeNoised Image')
ax1.imshow(denoised_image)
Out[9]:
<matplotlib.image.AxesImage at 0x2a28235ed08>

I have then used the trained detector to find the faces in the de-noised image.

In [10]:
gray_noise_denoise = cv2.cvtColor(denoised_image, cv2.COLOR_RGB2GRAY)

faces_denoise = face_cascade.detectMultiScale(gray_noise_denoise, 1.3, 6)

print('Number of faces detected:', len(faces_denoise))

image_with_detections_denoise = np.copy(denoised_image)

for (x,y,w,h) in faces_denoise:
    cv2.rectangle(image_with_detections_denoise, (x,y), (x+w,y+h), (255,0,0), 3)

fig = plt.figure(figsize = (8,8))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])

ax1.set_title('DeNoised Image with Face Detections')
ax1.imshow(image_with_detections_denoise)
Number of faces detected: 13
Out[10]:
<matplotlib.image.AxesImage at 0x2a2823ac348>

Step 3: Blur an Image and Perform Edge Detection

Importance of Blur in Edge Detection

Edge detection is a dimension reduction technique - by keeping only the edges of an image, we throw away a lot of non-discriminating information. Typically, the most useful kind of edge detection is one that preserves only the important, global structures while ignoring local structures that aren't very discriminative. Removing local structures while retaining global structures is therefore a crucial pre-processing step before performing edge detection on an image, and blurring does just that. Edge detection itself is implemented as a convolution over the image.

Canny edge detection

In the cell below, I have loaded in a test image and applied Canny edge detection to it. The original image is shown in the left panel of the figure, while the edge-detected version is shown on the right. The result looks very busy - too many little details are preserved in the image before it is sent to the edge detector. When applied in computer vision applications, edge detection should preserve global structure while doing away with local structures that don't help describe what objects are in the image.

In [11]:
# Load in the image
image = cv2.imread('images/fawzia.jpg')

# Convert to RGB colorspace
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)  

# Perform Canny edge detection
edges = cv2.Canny(gray,100,200)

# Dilate the image to amplify edges
edges = cv2.dilate(edges, None)

# Plot the RGB and edge-detected image
fig = plt.figure(figsize = (15,15))
ax1 = fig.add_subplot(121)
ax1.set_xticks([])
ax1.set_yticks([])

ax1.set_title('Original Image')
ax1.imshow(image)

ax2 = fig.add_subplot(122)
ax2.set_xticks([])
ax2.set_yticks([])

ax2.set_title('Canny Edges')
ax2.imshow(edges, cmap='gray')
Out[11]:
<matplotlib.image.AxesImage at 0x2a282840248>

Without first blurring the image, and removing small, local structures, a lot of irrelevant edge content gets picked up and amplified by the detector.

Blur the image then perform edge detection

I have repeated this experiment - blurring the image first to remove these local structures, so that only the important boundary details remain in the edge-detected image.

In [12]:
orig_img = np.copy(image)
kernel = np.ones((4,4),np.float32)/16
blur = cv2.filter2D(orig_img,-1,kernel)

# Perform Canny edge detection on blurred image
edges_blur = cv2.Canny(blur,100,200)

# Dilate the image to amplify edges
edges_blur = cv2.dilate(edges_blur, None)

# Plot the RGB and edge-detected image
fig = plt.figure(figsize = (15,15))
ax1 = fig.add_subplot(121)
ax1.set_xticks([])
ax1.set_yticks([])

ax1.set_title('Blurred Image')
ax1.imshow(blur)

ax2 = fig.add_subplot(122)
ax2.set_xticks([])
ax2.set_yticks([])

ax2.set_title('Canny Edges')
ax2.imshow(edges_blur, cmap='gray')
Out[12]:
<matplotlib.image.AxesImage at 0x2a282f6cd88>

Step 4: Automatically Hide the Identity of an Individual

Read in an image to perform identity detection

In [13]:
# Load in the image
image = cv2.imread('images/gus.jpg')

# Convert the image to RGB colorspace
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Display the image
fig = plt.figure(figsize = (6,6))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])
ax1.set_title('Original Image')
ax1.imshow(image)
Out[13]:
<matplotlib.image.AxesImage at 0x2a2833aa3c8>

Use blurring to hide the identity of an individual in an image

The idea here is to 1) automatically detect the face in this image, and then 2) blur it out!

In [14]:
# Convert the RGB  image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

# Extract the pre-trained face detector from an xml file
face_cascade = cv2.CascadeClassifier('detector_architectures/haarcascade_frontalface_default.xml')

# Detect the faces in image
faces = face_cascade.detectMultiScale(gray, 4, 5)

# Print the number of faces detected in the image
print('Number of faces detected:', len(faces))

# Make a copy of the original image to draw face detections on
image_with_detections = np.copy(image)

# Get the bounding box for each detected face
for (x,y,w,h) in faces:
    # Add a red bounding box to the detections image
    cv2.rectangle(image_with_detections, (x,y), (x+w,y+h), (255,0,0), 3)
    # Crop the face from the unannotated image so the red box is not blurred in later
    face_crop = image[y:y+h, x:x+w]
    

# Display the image with the detections
fig = plt.figure(figsize = (15,15))
ax1 = fig.add_subplot(121)
ax1.set_xticks([])
ax1.set_yticks([])

ax1.set_title('Image with Face Detection')
ax1.imshow(image_with_detections)

## Blur the bounding box around each detected face using an averaging filter and display the result
# (x, y) and face_crop refer to the last face found in the loop above; this image contains a single face
result_image = np.copy(image)
kernel_2 = np.ones((40,40),np.float32)/1600  # 40x40 averaging (box) kernel
blur_2 = cv2.filter2D(face_crop,-1,kernel_2)
result_image[y:y+blur_2.shape[0], x:x+blur_2.shape[1]] = blur_2

ax2 = fig.add_subplot(122)
ax2.set_xticks([])
ax2.set_yticks([])

ax2.set_title('Blurred Image')
ax2.imshow(result_image)
Number of faces detected: 1
Out[14]:
<matplotlib.image.AxesImage at 0x2a282b95a88>

Step 5: Create a CNN to Recognize Facial Keypoints

I have created my own end-to-end pipeline - employing convolutional networks in Keras along with OpenCV - to detect facial keypoints.

I have started by creating and then training a convolutional network that can detect facial keypoints in a small dataset of cropped images of human faces.

Facial keypoints (also called facial landmarks) are the small blue-green dots shown on each of the faces in the image above - there are 15 keypoints marked in each image. They mark important areas of the face - the eyes, corners of the mouth, the nose, etc. Facial keypoints can be used in a variety of machine learning applications from face and emotion recognition to commercial applications like the image filters popularized by Snapchat.

Make a facial keypoint detector

At a high level, facial keypoint detection is a regression problem. A single face corresponds to a set of 15 facial keypoints (15 $(x, y)$ coordinate pairs, i.e., a 30-dimensional output). Because the input data are images, I have employed a convolutional neural network to recognize patterns in the images and learn how to identify these keypoints given sets of labeled data.

In order to train a regressor, I have used a training set - a set of facial image / facial keypoint pairs to train on. For this I have used this dataset from Kaggle. The training dataset contains several thousand $96 \times 96$ grayscale images of cropped human faces, along with each face's 15 corresponding facial keypoints that have been placed by hand, and recorded in $(x, y)$ coordinates.

In [16]:
from utils import *

# Load training set
X_train, y_train = load_data()
print("X_train.shape == {}".format(X_train.shape))
print("y_train.shape == {}; y_train.min == {:.3f}; y_train.max == {:.3f}".format(
    y_train.shape, y_train.min(), y_train.max()))

# Load testing set
X_test, _ = load_data(test=True)
print("X_test.shape == {}".format(X_test.shape))
X_train.shape == (2140, 96, 96, 1)
y_train.shape == (2140, 30); y_train.min == -0.920; y_train.max == 0.996
X_test.shape == (1783, 96, 96, 1)

The coordinates of each set of facial landmarks - have been normalized to take on values in the range $[-1, 1]$, while the pixel values of each input point (a facial image) have been normalized to the range $[0,1]$.
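Assuming the usual convention for this dataset, where a normalized coordinate $c$ maps back to pixel position $c \times 48 + 48$ on the $96 \times 96$ grid, model outputs can be converted back to pixel space; the helper name `to_pixel_coords` is mine:

```python
import numpy as np

def to_pixel_coords(y_norm, image_size=96):
    """Map keypoints from the normalized [-1, 1] range back to pixel coordinates.

    y_norm: array of shape (30,) holding 15 interleaved (x, y) pairs.
    Returns an array of shape (15, 2) in pixel units.
    """
    half = image_size / 2  # 48 for a 96x96 image
    return y_norm.reshape(-1, 2) * half + half

# The normalized image center (0, 0) maps to pixel (48, 48)
print(to_pixel_coords(np.zeros(30))[0])  # [48. 48.]
```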

Visualize the Training Data

In [17]:
import matplotlib.pyplot as plt
%matplotlib inline

fig = plt.figure(figsize=(20,20))
fig.subplots_adjust(left=0, right=1, bottom=0, top=1, hspace=0.05, wspace=0.05)
for i in range(9):
    ax = fig.add_subplot(3, 3, i + 1, xticks=[], yticks=[])
    plot_data(X_train[i], y_train[i], ax)

For each training image, there are two landmarks per eyebrow (four total), three per eye (six total), four for the mouth, and one for the tip of the nose.

Build the CNN Architecture

A neural network is built for predicting the locations of facial keypoints.

The network accepts a $96 \times 96$ grayscale image as input and outputs a vector with 30 entries, corresponding to the predicted (horizontal and vertical) locations of 15 facial keypoints.

In [18]:
# Import deep learning resources from Keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, GlobalAveragePooling2D
from keras.layers import Flatten, Dense
from keras.layers.normalization import BatchNormalization


# Build a CNN architecture

model = Sequential()
model.add(Conv2D(filters=16, kernel_size=3, activation='relu', input_shape=(96, 96, 1)))
model.add(MaxPooling2D(pool_size=2))

model.add(Conv2D(filters=32, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(pool_size=2))

model.add(Conv2D(filters=64, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(pool_size=2))

model.add(Conv2D(filters=128, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(pool_size=2))

model.add(Flatten())

model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))


model.add(Dense(30))


# Summarize the model
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 94, 94, 16)        160       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 47, 47, 16)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 45, 45, 32)        4640      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 22, 22, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 20, 20, 64)        18496     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 10, 10, 64)        0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 8, 8, 128)         73856     
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 4, 4, 128)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 2048)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 512)               1049088   
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 30)                15390     
=================================================================
Total params: 1,161,630
Trainable params: 1,161,630
Non-trainable params: 0
_________________________________________________________________
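The parameter counts in this summary can be verified by hand: a Conv2D layer with $k \times k$ kernels has $(k \cdot k \cdot c_{in} + 1) \cdot filters$ parameters (one bias per filter), and a Dense layer has $(units_{in} + 1) \cdot units_{out}$:

```python
def conv_params(k, c_in, filters):
    # k*k*c_in weights per filter, plus one bias per filter
    return (k * k * c_in + 1) * filters

def dense_params(units_in, units_out):
    # one weight per input per unit, plus one bias per unit
    return (units_in + 1) * units_out

assert conv_params(3, 1, 16) == 160        # conv2d_1
assert conv_params(3, 16, 32) == 4640      # conv2d_2
assert conv_params(3, 32, 64) == 18496     # conv2d_3
assert conv_params(3, 64, 128) == 73856    # conv2d_4
assert dense_params(2048, 512) == 1049088  # dense_1
assert dense_params(512, 30) == 15390      # dense_2
print(conv_params(3, 1, 16) + conv_params(3, 16, 32)
      + conv_params(3, 32, 64) + conv_params(3, 64, 128)
      + dense_params(2048, 512) + dense_params(512, 30))  # 1161630
```

The total matches the 1,161,630 trainable parameters reported by `model.summary()`.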

Step 6: Compile and Train the Model

In [19]:
from keras.callbacks import ModelCheckpoint, History
from keras.optimizers import Adam

hist = History()
epochs = 50
batch_size = 64

checkpointer = ModelCheckpoint(filepath='weights.final_2.hdf5', 
                               verbose=1, save_best_only=True)

model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])  # MSE loss for regression; accuracy is logged but not very meaningful here

hist_final = model.fit(X_train, y_train, validation_split=0.2,
          epochs=epochs, batch_size=batch_size, callbacks=[checkpointer, hist], verbose=1)


model.save('my_model_final.h5')
Train on 1712 samples, validate on 428 samples
Epoch 1/50
1712/1712 [==============================] - 24s 14ms/step - loss: 0.0303 - accuracy: 0.4842 - val_loss: 0.0076 - val_accuracy: 0.6963

Epoch 00001: val_loss improved from inf to 0.00758, saving model to weights.final_2.hdf5
Epoch 2/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0091 - accuracy: 0.6051 - val_loss: 0.0050 - val_accuracy: 0.6963

Epoch 00002: val_loss improved from 0.00758 to 0.00495, saving model to weights.final_2.hdf5
Epoch 3/50
1712/1712 [==============================] - 20s 12ms/step - loss: 0.0068 - accuracy: 0.6320 - val_loss: 0.0046 - val_accuracy: 0.6963

Epoch 00003: val_loss improved from 0.00495 to 0.00460, saving model to weights.final_2.hdf5
Epoch 4/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0061 - accuracy: 0.6525 - val_loss: 0.0044 - val_accuracy: 0.6963

Epoch 00004: val_loss improved from 0.00460 to 0.00442, saving model to weights.final_2.hdf5
Epoch 5/50
1712/1712 [==============================] - 22s 13ms/step - loss: 0.0058 - accuracy: 0.6454 - val_loss: 0.0042 - val_accuracy: 0.6963

Epoch 00005: val_loss improved from 0.00442 to 0.00419, saving model to weights.final_2.hdf5
Epoch 6/50
1712/1712 [==============================] - 22s 13ms/step - loss: 0.0055 - accuracy: 0.6624 - val_loss: 0.0039 - val_accuracy: 0.6963

Epoch 00006: val_loss improved from 0.00419 to 0.00390, saving model to weights.final_2.hdf5
Epoch 7/50
1712/1712 [==============================] - 22s 13ms/step - loss: 0.0051 - accuracy: 0.6513 - val_loss: 0.0035 - val_accuracy: 0.7009

Epoch 00007: val_loss improved from 0.00390 to 0.00347, saving model to weights.final_2.hdf5
Epoch 8/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0043 - accuracy: 0.6595 - val_loss: 0.0029 - val_accuracy: 0.7103

Epoch 00008: val_loss improved from 0.00347 to 0.00288, saving model to weights.final_2.hdf5
Epoch 9/50
1712/1712 [==============================] - 21s 13ms/step - loss: 0.0038 - accuracy: 0.6764 - val_loss: 0.0027 - val_accuracy: 0.7033

Epoch 00009: val_loss improved from 0.00288 to 0.00267, saving model to weights.final_2.hdf5
Epoch 10/50
1712/1712 [==============================] - 22s 13ms/step - loss: 0.0035 - accuracy: 0.6857 - val_loss: 0.0025 - val_accuracy: 0.7033

Epoch 00010: val_loss improved from 0.00267 to 0.00253, saving model to weights.final_2.hdf5
Epoch 11/50
1712/1712 [==============================] - 27s 16ms/step - loss: 0.0033 - accuracy: 0.6764 - val_loss: 0.0021 - val_accuracy: 0.7173

Epoch 00011: val_loss improved from 0.00253 to 0.00213, saving model to weights.final_2.hdf5
Epoch 12/50
1712/1712 [==============================] - 23s 13ms/step - loss: 0.0030 - accuracy: 0.7074 - val_loss: 0.0024 - val_accuracy: 0.7150

Epoch 00012: val_loss did not improve from 0.00213
Epoch 13/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0029 - accuracy: 0.7033 - val_loss: 0.0021 - val_accuracy: 0.7430

Epoch 00013: val_loss improved from 0.00213 to 0.00207, saving model to weights.final_2.hdf5
Epoch 14/50
1712/1712 [==============================] - 21s 13ms/step - loss: 0.0026 - accuracy: 0.7015 - val_loss: 0.0018 - val_accuracy: 0.7313

Epoch 00014: val_loss improved from 0.00207 to 0.00177, saving model to weights.final_2.hdf5
Epoch 15/50
1712/1712 [==============================] - 20s 12ms/step - loss: 0.0026 - accuracy: 0.7056 - val_loss: 0.0017 - val_accuracy: 0.7266

Epoch 00015: val_loss improved from 0.00177 to 0.00173, saving model to weights.final_2.hdf5
Epoch 16/50
1712/1712 [==============================] - 27s 16ms/step - loss: 0.0023 - accuracy: 0.7120 - val_loss: 0.0017 - val_accuracy: 0.7547

Epoch 00016: val_loss improved from 0.00173 to 0.00166, saving model to weights.final_2.hdf5
Epoch 17/50
1712/1712 [==============================] - 30s 18ms/step - loss: 0.0022 - accuracy: 0.7354 - val_loss: 0.0019 - val_accuracy: 0.7593

Epoch 00017: val_loss did not improve from 0.00166
Epoch 18/50
1712/1712 [==============================] - 28s 17ms/step - loss: 0.0022 - accuracy: 0.7220 - val_loss: 0.0015 - val_accuracy: 0.7687

Epoch 00018: val_loss improved from 0.00166 to 0.00152, saving model to weights.final_2.hdf5
Epoch 19/50
1712/1712 [==============================] - 25s 15ms/step - loss: 0.0020 - accuracy: 0.7173 - val_loss: 0.0015 - val_accuracy: 0.7547

Epoch 00019: val_loss did not improve from 0.00152
Epoch 20/50
1712/1712 [==============================] - 27s 16ms/step - loss: 0.0019 - accuracy: 0.7442 - val_loss: 0.0014 - val_accuracy: 0.7453

Epoch 00020: val_loss improved from 0.00152 to 0.00143, saving model to weights.final_2.hdf5
Epoch 21/50
1712/1712 [==============================] - 28s 16ms/step - loss: 0.0020 - accuracy: 0.7459 - val_loss: 0.0015 - val_accuracy: 0.7687

Epoch 00021: val_loss did not improve from 0.00143
Epoch 22/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0019 - accuracy: 0.7430 - val_loss: 0.0015 - val_accuracy: 0.7827

Epoch 00022: val_loss did not improve from 0.00143
Epoch 23/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0018 - accuracy: 0.7617 - val_loss: 0.0014 - val_accuracy: 0.7547

Epoch 00023: val_loss improved from 0.00143 to 0.00135, saving model to weights.final_2.hdf5
Epoch 24/50
1712/1712 [==============================] - 23s 13ms/step - loss: 0.0017 - accuracy: 0.7494 - val_loss: 0.0013 - val_accuracy: 0.8014

Epoch 00024: val_loss improved from 0.00135 to 0.00132, saving model to weights.final_2.hdf5
Epoch 25/50
1712/1712 [==============================] - 22s 13ms/step - loss: 0.0016 - accuracy: 0.7605 - val_loss: 0.0013 - val_accuracy: 0.7850

Epoch 00025: val_loss improved from 0.00132 to 0.00130, saving model to weights.final_2.hdf5
Epoch 26/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0016 - accuracy: 0.7681 - val_loss: 0.0014 - val_accuracy: 0.7991

Epoch 00026: val_loss did not improve from 0.00130
Epoch 27/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0015 - accuracy: 0.7728 - val_loss: 0.0013 - val_accuracy: 0.7780

Epoch 00027: val_loss did not improve from 0.00130
Epoch 28/50
1712/1712 [==============================] - 23s 13ms/step - loss: 0.0016 - accuracy: 0.7640 - val_loss: 0.0014 - val_accuracy: 0.8037

Epoch 00028: val_loss did not improve from 0.00130
Epoch 29/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0015 - accuracy: 0.7769 - val_loss: 0.0012 - val_accuracy: 0.7991

Epoch 00029: val_loss improved from 0.00130 to 0.00123, saving model to weights.final_2.hdf5
Epoch 30/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0014 - accuracy: 0.7652 - val_loss: 0.0012 - val_accuracy: 0.7991

Epoch 00030: val_loss improved from 0.00123 to 0.00123, saving model to weights.final_2.hdf5
Epoch 31/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0014 - accuracy: 0.7669 - val_loss: 0.0014 - val_accuracy: 0.7991

Epoch 00031: val_loss did not improve from 0.00123
Epoch 32/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0014 - accuracy: 0.7821 - val_loss: 0.0016 - val_accuracy: 0.8014

Epoch 00032: val_loss did not improve from 0.00123
Epoch 33/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0015 - accuracy: 0.7769 - val_loss: 0.0012 - val_accuracy: 0.7944

Epoch 00033: val_loss improved from 0.00123 to 0.00116, saving model to weights.final_2.hdf5
Epoch 34/50
1712/1712 [==============================] - 38s 22ms/step - loss: 0.0013 - accuracy: 0.7886 - val_loss: 0.0013 - val_accuracy: 0.8061

Epoch 00034: val_loss did not improve from 0.00116
Epoch 35/50
1712/1712 [==============================] - 29s 17ms/step - loss: 0.0014 - accuracy: 0.7757 - val_loss: 0.0012 - val_accuracy: 0.8014

Epoch 00035: val_loss did not improve from 0.00116
Epoch 36/50
1712/1712 [==============================] - 31s 18ms/step - loss: 0.0013 - accuracy: 0.7786 - val_loss: 0.0012 - val_accuracy: 0.7991

Epoch 00036: val_loss did not improve from 0.00116
Epoch 37/50
1712/1712 [==============================] - 22s 13ms/step - loss: 0.0013 - accuracy: 0.7845 - val_loss: 0.0012 - val_accuracy: 0.8037

Epoch 00037: val_loss did not improve from 0.00116
Epoch 38/50
1712/1712 [==============================] - 28s 16ms/step - loss: 0.0012 - accuracy: 0.7821 - val_loss: 0.0011 - val_accuracy: 0.7921

Epoch 00038: val_loss improved from 0.00116 to 0.00110, saving model to weights.final_2.hdf5
Epoch 39/50
1712/1712 [==============================] - 21s 13ms/step - loss: 0.0012 - accuracy: 0.7950 - val_loss: 0.0011 - val_accuracy: 0.8107

Epoch 00039: val_loss improved from 0.00110 to 0.00109, saving model to weights.final_2.hdf5
Epoch 40/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0012 - accuracy: 0.7850 - val_loss: 0.0011 - val_accuracy: 0.8014

Epoch 00040: val_loss improved from 0.00109 to 0.00108, saving model to weights.final_2.hdf5
Epoch 41/50
1712/1712 [==============================] - 22s 13ms/step - loss: 0.0012 - accuracy: 0.7815 - val_loss: 0.0011 - val_accuracy: 0.8037

Epoch 00041: val_loss did not improve from 0.00108
Epoch 42/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0012 - accuracy: 0.7839 - val_loss: 0.0013 - val_accuracy: 0.8201

Epoch 00042: val_loss did not improve from 0.00108
Epoch 43/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0012 - accuracy: 0.8002 - val_loss: 0.0011 - val_accuracy: 0.7991

Epoch 00043: val_loss did not improve from 0.00108
Epoch 44/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0011 - accuracy: 0.7996 - val_loss: 0.0011 - val_accuracy: 0.8364

Epoch 00044: val_loss did not improve from 0.00108
Epoch 45/50
1712/1712 [==============================] - 22s 13ms/step - loss: 0.0011 - accuracy: 0.8026 - val_loss: 0.0011 - val_accuracy: 0.8014

Epoch 00045: val_loss did not improve from 0.00108
Epoch 46/50
1712/1712 [==============================] - 23s 13ms/step - loss: 0.0011 - accuracy: 0.8113 - val_loss: 0.0011 - val_accuracy: 0.8318

Epoch 00046: val_loss improved from 0.00108 to 0.00106, saving model to weights.final_2.hdf5
Epoch 47/50
1712/1712 [==============================] - 22s 13ms/step - loss: 0.0012 - accuracy: 0.7979 - val_loss: 0.0013 - val_accuracy: 0.8248

Epoch 00047: val_loss did not improve from 0.00106
Epoch 48/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0011 - accuracy: 0.7996 - val_loss: 0.0011 - val_accuracy: 0.8341

Epoch 00048: val_loss did not improve from 0.00106
Epoch 49/50
1712/1712 [==============================] - 21s 12ms/step - loss: 0.0011 - accuracy: 0.8160 - val_loss: 0.0012 - val_accuracy: 0.8248

Epoch 00049: val_loss did not improve from 0.00106
Epoch 50/50
1712/1712 [==============================] - 20s 12ms/step - loss: 0.0011 - accuracy: 0.8002 - val_loss: 0.0011 - val_accuracy: 0.8271

Epoch 00050: val_loss did not improve from 0.00106
In [20]:
model.load_weights('weights.final_2.hdf5')
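Reloading the checkpointed weights matters here: the final epoch's val_loss (0.0011) is not the best one — the checkpoint callback last saved at epoch 46 (val_loss 0.00106), so `load_weights` restores that best epoch rather than the last. A minimal sketch of picking the best epoch out of a Keras-style history dict (stand-in values below; `hist_final.history` is the real object):

```python
# Find the epoch with the lowest validation loss from a Keras-style
# history dict (stand-in values; the real run's best was epoch 46).
history = {'val_loss': [0.00123, 0.00116, 0.00110, 0.00106, 0.00108]}

best_idx = min(range(len(history['val_loss'])),
               key=history['val_loss'].__getitem__)
best_val = history['val_loss'][best_idx]
print(best_idx, best_val)  # -> 3 0.00106
```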

Step 7: Visualize the Loss and Test Predictions

Below, I plot the training and validation loss of the trained network across all 50 epochs.

In [21]:
# Visualize the training and validation loss of the neural network
plt.plot(range(epochs), hist_final.history[
         'val_loss'], 'g-', label='Val Loss')
plt.plot(range(epochs), hist_final.history[
         'loss'], 'g--', label='Train Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

Visualize a Subset of the Test Predictions

In [22]:
y_test = model.predict(X_test)
fig = plt.figure(figsize=(20,20))
fig.subplots_adjust(left=0, right=1, bottom=0, top=1, hspace=0.05, wspace=0.05)
for i in range(9):
    ax = fig.add_subplot(3, 3, i + 1, xticks=[], yticks=[])
    plot_data(X_test[i], y_test[i], ax)

Step 8: Build a Robust Facial Keypoints Detector

  • Detect the faces in an input image automatically
  • Predict the facial keypoints for each face detected in the image
  • Paint the predicted keypoints on each detected face

Facial Keypoints Detector

Our function should perform the following steps:

  1. Accept a color image.
  2. Convert the image to grayscale.
  3. Detect and crop the face contained in the image.
  4. Locate the facial keypoints in the cropped image.
  5. Overlay the facial keypoints on the original (color, uncropped) image.
In [23]:
# Load in color image for face detection
image = cv2.imread('images/obamas4.jpg')


# Convert the image to RGB colorspace
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)


# plot our image
fig = plt.figure(figsize = (9,9))
ax1 = fig.add_subplot(111)
ax1.set_xticks([])
ax1.set_yticks([])
ax1.set_title('image')
ax1.imshow(image)
Out[23]:
<matplotlib.image.AxesImage at 0x2a29d49a288>
In [24]:
# Use the face detection code with our trained conv-net
def plot_keypoints(img_path, face_cascade_path, model_path, scale=1.2, neighbors=5, key_size=10):
    
    face_cascade=cv2.CascadeClassifier(face_cascade_path) 
    img = cv2.imread(img_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scale, neighbors)
    fig = plt.figure(figsize=(40, 40))
    ax = fig.add_subplot(121, xticks=[], yticks=[])
    ax.set_title('Image with Facial Keypoints')

    print('Number of faces detected:', len(faces))

    image_with_detections = np.copy(img)

    for (x,y,w,h) in faces:
        cv2.rectangle(image_with_detections, (x,y), (x+w,y+h), (255,0,0), 3)
        bgr_crop = image_with_detections[y:y+h, x:x+w] 
        orig_shape_crop = bgr_crop.shape
        gray_crop = cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2GRAY)
        resize_gray_crop = cv2.resize(gray_crop, (96, 96)) / 255
        model = load_model(model_path)
        landmarks = np.squeeze(model.predict(
            np.expand_dims(np.expand_dims(resize_gray_crop, axis=-1), axis=0)))
        ax.scatter(((landmarks[0::2] * 48 + 48)*orig_shape_crop[0]/96)+x, 
                   ((landmarks[1::2] * 48 + 48)*orig_shape_crop[1]/96)+y, 
                   marker='o', c='c', s=key_size)
        
    ax.imshow(cv2.cvtColor(image_with_detections, cv2.COLOR_BGR2RGB))
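The coordinate arithmetic in the scatter call deserves a note: the network predicts keypoints in [-1, 1] on the 96×96 crop, so `landmark * 48 + 48` maps them to crop pixels, a factor of `crop_size / 96` rescales to the original crop size, and adding `(x, y)` shifts into full-image coordinates. A minimal sketch of that mapping (plain NumPy, hypothetical helper name `to_image_coords`, assuming a square crop as the Haar detector returns):

```python
import numpy as np

def to_image_coords(landmarks, crop_size, x, y):
    """Map normalized [-1, 1] keypoints on a 96x96 crop back to full-image pixels."""
    xs = (landmarks[0::2] * 48 + 48) * crop_size / 96 + x   # even indices: x coords
    ys = (landmarks[1::2] * 48 + 48) * crop_size / 96 + y   # odd indices: y coords
    return xs, ys

# A keypoint at the crop center (0, 0) lands at the center of a 192-pixel
# detection box whose top-left corner is at (50, 80).
xs, ys = to_image_coords(np.array([0.0, 0.0]), crop_size=192, x=50, y=80)
print(xs[0], ys[0])  # -> 146.0 176.0
```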
In [25]:
# Paint the predicted keypoints on the test image
plot_keypoints('images/obamas4.jpg',
               'detector_architectures/haarcascade_frontalface_default.xml',
               'my_model_final.h5')
Number of faces detected: 2
In [35]:
# Paint the predicted keypoints on the test image
plot_keypoints('images/fawzia.jpg',
               'detector_architectures/haarcascade_frontalface_default.xml',
               'my_model_final.h5')
Number of faces detected: 1
In [39]:
# Paint the predicted keypoints on the test image
plot_keypoints('images/pic2.jpg',
               'detector_architectures/haarcascade_frontalface_default.xml',
               'my_model_final.h5')
Number of faces detected: 2
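One robustness detail worth keeping in mind: Haar detections near the image border can yield crop windows that extend partially outside the frame, which would make the face crop smaller than `(w, h)` suggests. A small sketch (plain Python, hypothetical helper name `clamp_box`) of clamping a detection box to the image bounds before cropping:

```python
def clamp_box(x, y, w, h, img_w, img_h):
    """Clamp an (x, y, w, h) detection box so the crop stays inside the image."""
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(img_w, x + w), min(img_h, y + h)
    return x0, y0, x1 - x0, y1 - y0

# A box hanging off the right edge of a 100x100 image is trimmed to fit.
print(clamp_box(90, 10, 30, 30, 100, 100))  # -> (90, 10, 10, 30)
```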